Grounded Discovery of Coordinate Term Relationships between Software Entities
Authors
Abstract
We present an approach for detecting coordinate-term relationships between entities from the software domain that refer to Java classes. Usually, relations are found by examining corpus statistics associated with text entities. In some technical domains, however, we have access to additional information about the real-world objects named by the entities, suggesting that coupling information about the “grounded” entities with corpus statistics might lead to improved methods for relation discovery. To this end, we develop a similarity measure for Java classes using distributional information about how they are used in software, which we combine with corpus statistics on the distribution of contexts in which the classes appear in text. Using our approach, cross-validation accuracy on this dataset can be improved dramatically, from around 60% to 88%. Human labeling results show that our classifier has an F1 score of 86% over the top 1000 predicted pairs.
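The abstract does not spell out the similarity measure, so the following is only a minimal sketch of the general idea, assuming each class is represented by two sparse count vectors: one over contexts in which it is used in code and one over contexts in which it is mentioned in text. Cosine similarity is a stand-in chosen for illustration, and the names `cosine`, `pair_features`, `code_ctx`, and `text_ctx`, as well as the toy counts, are hypothetical rather than taken from the paper.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def pair_features(code_ctx: dict, text_ctx: dict, x: str, y: str) -> dict:
    """Build one feature dict for a candidate class pair (x, y).

    code_ctx and text_ctx map a class name to a Counter of contexts
    (hypothetical inputs; the paper's actual features may differ).
    """
    return {
        "code_sim": cosine(code_ctx[x], code_ctx[y]),
        "text_sim": cosine(text_ctx[x], text_ctx[y]),
    }


# Toy, made-up counts purely for illustration.
code_ctx = {
    "ArrayList": Counter({"add": 3, "size": 2}),
    "LinkedList": Counter({"add": 3, "size": 2}),
    "Thread": Counter({"start": 4}),
}
text_ctx = {
    "ArrayList": Counter({"list": 5, "collection": 1}),
    "LinkedList": Counter({"list": 4, "queue": 1}),
    "Thread": Counter({"concurrency": 3}),
}

if __name__ == "__main__":
    print(pair_features(code_ctx, text_ctx, "ArrayList", "LinkedList"))
    print(pair_features(code_ctx, text_ctx, "ArrayList", "Thread"))
```

Feature dicts of this shape could then be passed to any standard classifier to decide whether a pair is a coordinate-term pair; how the paper actually combines the two signals is not stated in the abstract.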
Similar resources
Data Analysis in the Grounded Theory Research Method
Grounded Theory is a qualitative research approach used to explore the social processes that are present within human interactions. Glaser and Strauss (1967) developed the method and published the first text addressing method issues. Grounded theory includes systematic techniques and procedures of analysis that enable the researcher to develop a substantive theory. The discovery of a core varia...
Using Dependency Parsing and Probabilistic Inference to Extract Relationships between Genes, Proteins and Malignancies Implicit Among Multiple Biomedical Research Abstracts
We describe BioLiterate, a prototype software system which infers relationships involving relationships between genes, proteins and malignancies from research abstracts, and has initially been tested in the domain of the molecular genetics of oncology. The architecture uses a natural language processing module to extract entities, dependencies and simple semantic relationships from texts, and t...
On the autonomy of software entities and modes of organisation
As software becomes more complex and needs to operate in more open environments, the relationships between the encapsulated entities that constitute the software can become nondeterministic. In a number of branches of computer science, organisational mechanisms and structures have been seen as a way to coordinate the complex behaviour between software entities. In particular, organisational abs...
Seeded Discovery of Base Relations in Large Corpora
Relationship discovery is the task of identifying salient relationships between named entities in text. We propose novel approaches for two sub-tasks of the problem: identifying the entities of interest, and partitioning and describing the relations based on their semantics. In particular, we show that term frequency patterns can be used effectively instead of supervised NER, and that the pmedi...
Grounding Spatial Named Entities For Information Extraction And Question Answering
The task of named entity annotation of unseen text has recently been successfully automated with near-human performance. But the full task involves more than annotation, i.e. identifying the scope of each (continuous) text span and its class (such as place name). It also involves grounding the named entity (i.e. establishing its denotation with respect to the world or a model). The latter aspec...
Journal title:
- CoRR
Volume abs/1505.00277
Publication date 2015